ADR for updated data model #182

Open
wants to merge 1 commit into
base: main

jacksonj04
Collaborator

@jacksonj04 jacksonj04 commented Nov 11, 2024

Our existing data model is starting to creak under the load, and needs a bit of a refresh. This ADR proposes a new structure for the data to better support future requirements.

@jacksonj04 jacksonj04 force-pushed the adr/document-data-structure branch 2 times, most recently from d8e9977 to 992bbe1 Compare November 11, 2024 11:05
@jacksonj04 jacksonj04 force-pushed the adr/document-data-structure branch 2 times, most recently from 9007447 to 720037a Compare November 14, 2024 09:49
This proposes a richer data structure to help us model how the various parts of "a decision" exist within the service, both logically and conceptually.
@jacksonj04 jacksonj04 force-pushed the adr/document-data-structure branch from 720037a to 7ede3ce Compare November 14, 2024 09:54
@jacksonj04 jacksonj04 marked this pull request as ready for review November 14, 2024 09:54

Documents are the overarching item in the data structure, and are what most people will actually mean when they talk about a "judgment". Each document MUST be assigned a unique, non-semantic identifier by the Find Case Law service, and may have one or more other identifiers such as NCNs.

Where a relationship exists between two documents (e.g. "X is a press summary of Y"), it would likely be stored bidirectionally, i.e. as both "is summarised by" and "is a summary of", to simplify retrieval.
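
As a minimal sketch of the two paragraphs above, and not the structure the ADR itself defines: a document carries a service-assigned, non-semantic identifier plus any other identifiers, and each relationship is written to both documents. The class name `Document`, the `relate` helper, the relationship keys, the use of a UUID, and the placeholder NCN value are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4

# Hypothetical relationship names; each maps to its inverse.
INVERSE = {
    "is_summary_of": "is_summarised_by",
    "is_summarised_by": "is_summary_of",
}


@dataclass
class Document:
    # Unique, non-semantic identifier assigned by the service.
    # Using a UUID here is an assumption, not something the ADR states.
    id: UUID = field(default_factory=uuid4)
    # Other identifiers such as NCNs, keyed by a hypothetical scheme name.
    identifiers: dict[str, str] = field(default_factory=dict)
    # Relationship name -> ids of the related documents.
    relationships: dict[str, set[UUID]] = field(default_factory=dict)


def relate(subject: Document, relationship: str, other: Document) -> None:
    """Store the relationship on both documents so either side can be looked up directly."""
    subject.relationships.setdefault(relationship, set()).add(other.id)
    other.relationships.setdefault(INVERSE[relationship], set()).add(subject.id)


# The NCN below is a made-up placeholder, not a real citation.
judgment = Document(identifiers={"ncn": "[2099] EXAMPLE 1"})
press_summary = Document()
relate(press_summary, "is_summary_of", judgment)
```

Storing both directions costs an extra write per relationship, but finding the press summary for a judgment becomes a single lookup on the judgment rather than a scan of every document.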
Collaborator

I think it's worth being specific that a judgment and its press summary are different documents, and that we don't yet have a settled opinion on language.

Collaborator

What does the relationship between a revision and its document look like?


A revision represents a distinct submission of a document to the National Archives, usually by a court or tribunal. For new submissions this will usually be via TDR, but some legacy ingestions may have been done via other means.

A revision SHOULD have a "source document" which we consider to be the canonical representation of the revision, and from which all other representations are derived. For new submissions this will usually be a .docx file, but it could be another type of file for legacy ingestions or future submissions. For some legacy ingestions the original file may no longer be available to the service for every past revision (although the file will remain in The National Archives' preservation system).
Collaborator

Is the source document in practice a link to S3 and maybe a hash of that file?
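
One way to read the question above, purely as a sketch and not a settled design: the revision holds an optional reference to its source document, made up of a storage location and a content hash. The names `SourceDocument`, `Revision` and `describe_source`, the `s3://` URI, and the choice of SHA-256 are all assumptions. The field is optional because the ADR text says a revision SHOULD (not MUST) have a source document.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4

DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


@dataclass(frozen=True)
class SourceDocument:
    uri: str         # where the file is stored, e.g. an S3 object URI (assumption)
    sha256: str      # hex digest of the file content, for integrity checks
    media_type: str  # usually .docx for new submissions, per the ADR text


@dataclass
class Revision:
    document_id: UUID      # the document this revision belongs to
    submitted_at: datetime
    # Optional: legacy ingestions may no longer have the original file available.
    source: Optional[SourceDocument] = None


def describe_source(uri: str, content: bytes, media_type: str) -> SourceDocument:
    """Build a source-document reference from the file's bytes."""
    return SourceDocument(uri, hashlib.sha256(content).hexdigest(), media_type)


document_id = uuid4()
new_revision = Revision(
    document_id,
    datetime.now(timezone.utc),
    # Hypothetical bucket and key; the bytes are a stand-in for a real file.
    describe_source("s3://example-bucket/revisions/1/source.docx", b"placeholder bytes", DOCX),
)
legacy_revision = Revision(document_id, datetime.now(timezone.utc))
```

Keeping the hash alongside the link would let a derived representation be traced back to the exact bytes it was produced from, even if the object at that location is later replaced.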


2 participants